Skip to content

MDEV-38412 System tablespace fails to shrink due to legacy tables#4884

Open
Thirunarayanan wants to merge 1 commit into11.4from
MDEV-38412
Open

MDEV-38412 System tablespace fails to shrink due to legacy tables#4884
Thirunarayanan wants to merge 1 commit into11.4from
MDEV-38412

Conversation

@Thirunarayanan
Copy link
Copy Markdown
Member

Problem:

InnoDB system tablespace autoshrink can fail when legacy tables exist in the system tablespace (space_id=0). These are typically old tables from earlier InnoDB versions that doesn't have '/' character in the table name.

Solution:

During system tablespace autoshrinking, InnoDB now proactively drops unknown/legacy tables that meet the following criteria:

  • Table name does not contain '/'
  • Table is not an InnoDB system table. In that case, delete the record from system tables.

drop_tables_by_filter(): Scan and drop tables by predicate

  • Used this function to drop the garbage table during restore

  • drop the unknown tables from system tablespace

  • Added DBUG_EXECUTE_IF("create_dummy_sys_tables") for testing the scenario.

This cleanup happens automatically before the autoshrink operation, preventing failures and allowing the system tablespace to be properly truncated.

@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

DBUG_RETURN(2);
}

static bool should_drop_garbage_table(const rec_t* rec, ulint len)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing noexcept, and it’d be better to use size_t instead of the alias ulint. There is no comment explaining the return value and the parameters.

Comment on lines +1390 to +1393
@param[in] rec SYS_TABLES record containing the table name
@param[in] len length of the table name
@return true if the table should be dropped, false otherwise */
static bool should_drop_unknown_table(const rec_t* rec, ulint len)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing noexcept. A space around the * is misplaced. Please, no [in] in new code. The @return comment should not mention individual return values; we have @retval for that.

Comment on lines +1399 to +1401
const byte *field= rec_get_nth_field_old(rec, DICT_FLD__SYS_TABLES__ID, &id_len);
if (dict_sys.is_sys_table(mach_read_from_8(field)))
return false;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if expression is missing id_len != 8 ||.

Comment on lines +1301 to +1303
if (dict_table_t *table= dict_sys.load_table(
{reinterpret_cast<const char*>(pcur.old_rec), len},
DICT_ERR_IGNORE_DROP))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we have some entries in SYS_INDEXES but the table cannot be loaded because the entries in SYS_TABLES, SYS_COLUMNS, SYS_FIELDS are incorrect? In that case, we would fail to drop the table.

Do we really have to load a table definition into the cache? Would it suffice to unconditionally execute some simpler SQL to delete the corresponding entries from SYS_TABLES, SYS_INDEXES, SYS_FIELDS, SYS_COLUMNS, during a slow shutdown when :autoshrinkis enabled? What really matters here is a call todict_drop_index_tree(). That could be executed by row_purge_remove_clust_if_poss_low()` as part of the slow shutdown.

The basic algorithm would be like this:

  1. Collect the table ID from SYS_TABLES and report the table names. (Check for SYS_TABLES.SPACE=0 directly from each record.)
  2. For each table ID, execute the SQL to DELETE FROM SYS_… WHERE SPACE=0 AND TABLE_ID=:id.
  3. Let the purge run into completion. It will take care of invoking dict_drop_index_tree(), and it’s already checking for SYS_INDEXES.PAGE=FIL_NULL there.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +1326 to +1329
if (err == DB_SUCCESS)
{
trx->commit();
ut_ad(deleted.empty());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted.empty() should hold irrespective of the err value.

Comment on lines +1365 to +1366
for (pfs_os_file_t d : deleted)
os_file_close(d);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be no handles of deleted files to close, because these tables are located in the system tablespace (fil_system.sys_space), which will not be deleted.

Comment on lines +1403 to +1404
sql_print_information("InnoDB: Dropping the unknown table %.*s",
static_cast<int>(len), rec);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int(len) is shorter and equivalent to static_cast<int>(len).

Problem:
=======
 - InnoDB system tablespace fails to autoshrink when it contains
legacy internal tables. These are non-user tables and internal
table exist from older version. Because the current shrink logic
does recognize these entries as user table, they block the
defragmentation process required to reduce the tablespace size.

Solution:
=========
To enable successful shrinking, InnoDB has been updated to
identify and remove these legacy entries during the startup:

fsp_drop_legacy_tables(): A new function that scans the InnoDB
system tables for entries lacking the / naming convention.
It triggers the removal of these table objects from InnoDb system
tables and ensures the purge thread subsequently drops any
associated index trees in SYS_INDEXES.

delete_from_sys_table_entries(): Function is to remove specific
table IDs from internal system catalogs.

fsp_system_tablespace_truncate(): If legacy tables are detected,
InnoDB prioritizes their removal. To ensure data integrity and
complete the shrink, a two-restart sequence may be required:
1) purge the legacy table
2) Defragment the system tablespace and shrink the
system tablespace further
Comment on lines +5862 to +5865
bool drop_unknown= false;
err= fsp_drop_legacy_tables(drop_unknown);
if (err == DB_SUCCESS || drop_unknown == false)
err= space->defragment();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using an output parameter, can we use the special return value DB_SUCCESS_LOCKED_REC to indicate that some unknown tables were dropped?

Comment on lines +5693 to +5699
/** Determine if a table should be dropped during system tablespace
autoshrinking. This function identifies legacy tables in the
system tablespace that should be removed. A table is considered a
legacy/unknown table if the table name does not contain '/'
(indicating it's not a proper database/table name)
@return error code or DB_SUCCESS */
static dberr_t fsp_drop_legacy_tables(bool &drop_unknown) noexcept
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function as well as delete_from_sys_table_entries() would seem to logically belong to the compilation unit dict/drop.cc.

Comment on lines +5710 to +5713
const byte *field= nullptr;
ulint len= 0;
if (rec_get_deleted_flag(rec, 0))
continue;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assignments could be moved after the first if.

Comment on lines +5742 to +5743
sql_print_information("InnoDB: Found the unknown table %.*s",
(int) len, rec);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid C-style casts. int(len) is shorter too. The message does not say what we are going to do with the table:

    sql_print_information("InnoDB: DROP TABLE %.*s", int(len), rec);

I think that using the SQL syntax should make it clear even for DBAs who do not speak English.

Comment on lines +5763 to +5764
if (err == DB_SUCCESS)
trx->commit();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of assigning an output parameter drop_unknown, can we just assign err= DB_SUCCESS_LOCKED_REC here, to indicate that something was dropped?

Comment on lines +1464 to +1465
error= que_eval_sql(
nullptr, "PROCEDURE CREATE_DUMMY_1() IS\n"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation is a bit off here.

Comment on lines +92 to +97
--source include/restart_mysqld.inc
let SEARCH_PATTERN=InnoDB: Dropping the unknown table;
--source include/search_pattern_in_file.inc

let SEARCH_PATTERN=InnoDB: Found the unknown table;
--source include/search_pattern_in_file.inc
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also look for the specific names of the garbage tables here? We could be outputting some garbage, and the test would not catch that.

Can we also check the contents of INFORMATION_SCHEMA.INNODB_SYS_TABLES and friends to show that the dummy tables appear and disappear as intended?

Comment on lines +5751 to +5758
sql_print_information("InnoDB: Dropping the unknown tables");
trx_t *trx= trx_create();
trx_start_for_ddl(trx);
dict_sys.lock(SRW_LOCK_CALL);

for (table_id_t table_id : unknown_tables)
{
err= delete_from_sys_table_entries(table_id, trx);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we only need to lock dict_sys for invoking the SQL parser. There is no need to hold the latch across the transaction commit. What we are really missing here is a call to lock_sys_tables(trx) after the creation of the transaction.

Comment on lines +5746 to +5748
mtr.commit();
if (err || unknown_tables.size() == 0)
return err;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need mtr or pcur after this point. Could this be split into two functions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants